Abstract - The coming decade of fast, cheap and 
miniaturized electronics and sensory devices opens new 
pathways for the development of sophisticated equipment to 
overcome limitations of the human senses. This paper 
addresses the technical feasibility of augmenting human 
vision through Sensing Super-position by mixing natural 
human sensing. The current implementation of the device 
translates the output of visual and other passive or active sensory 
instruments into sounds, which become relevant when the 
visual resolution is insufficient for very difficult and 
particular sensing tasks. 

A successful Sensing Super-position meets many human 
and pilot vehicle system requirements. The system can be 
further developed into a cheap, portable, and low-power device, 
taking into account the limited capabilities of the human 
user as well as the typical characteristics of his dynamic 
environment. The system operates in real time, giving the 
desired information for the particular augmented sensing 
tasks. 

The Sensing Super-position device increases the perceived 
image resolution, which is obtained via an auditory 
representation as well as the visual representation. Auditory 
mapping is performed to distribute an image in time. The 
three-dimensional spatial brightness and multi-spectral maps 
of a sensed image are processed using real-time image 
processing techniques (e.g., histogram normalization) and 
transformed into a two-dimensional map of an audio signal 
as a function of frequency and time. 

This paper details the approach of developing Sensing 
Super-position systems as a way to augment the human 
vision system by exploiting the capabilities of the human 
hearing system as an additional neural input. The human 
hearing system is capable of learning to process and 
interpret extremely complicated and rapidly changing 
auditory patterns. The known capabilities of the human 
hearing system to learn and understand complicated 
auditory patterns provided the basic motivation for 
developing an image-to-sound mapping system. 

The human brain is superior to most existing computer 
systems in rapidly extracting relevant information from 
blurred, noisy, and redundant images. From a theoretical 
viewpoint, this means that the available bandwidth is not 
exploited in an optimal way. While image-processing 
techniques can manipulate, condense and focus the 
information (e.g., Fourier Transforms), keeping the mapping 
as direct and simple as possible might also reduce the risk of 
accidentally filtering out important clues. After all, a 
perfect non-redundant sound representation is especially 
prone to loss of relevant information in the imperfect 
human hearing system. Also, a complicated non-redundant 
image-to-sound mapping may well be far more difficult to 
learn and comprehend than a straightforward mapping, 
while the mapping system would increase in complexity and 
cost. This work will demonstrate some basic information 
processing for optimal information capture for head-mounted 
systems. 

1. Introduction 

Humans rely heavily on vision to sense the environment in 
order to achieve a wide variety of goals. The expected 
payoff is to augment the human visual sensory system, which is 
deficient in flight in many respects. 

Auditory technology, often integrated within helmet-mounted 
display systems, has been limited to the 
use of radio communication. Modern research in 
stereoscopy has demonstrated the use of stereo sounds for 
human perception of the environment. In addition to sensor 
technology advancement, there has been a tremendous effort 
towards the development of newer display modalities. Such 
devices can be readily integrated in pilot head-mounted 
systems to provide both pilotage and targeting imagery 
mappings. 

Over the past 30 years, there have been innumerable articles 
and scientific papers that address the design and 
performance of helmet- and head-mounted systems. A large 
portion of this paper is the result of research into a capability to 
augment the current state of the art in head-mounted display 
systems (implicitly), rather than to propose a new system. 
With the fielding of various human factors in military 
systems, research in this area of mapping visual to 
audio stereoscopy is fairly new but has been effective [6]. 
While this paper is intended to provide a fairly 
comprehensive overview of a particular project, the 
technology and its interface for human perception have 
numerous potential uses beyond the proposed ideas. 


2. Multi-Spectral Human Sensing 

Since the time of Aristotle (384-322 BC), humankind has 
been interested in perceiving what is beyond its “vision”, 
but it was not until Roentgen (1845-1923) discovered 
X-rays, which enabled him to see inside living tissue, that 
“vision” beyond the naked eye entered a new era [7]. In the 
following years imaging and sensing techniques have 
developed so rapidly that astronomy, medicine and geology 
are just a few of the areas where sensing beyond vision has 
been found useful. Finally, whether in electromagnetic, 
optical or acoustic sensing, the main intention of this 
research is to augment our understanding of the surrounding 
world [7]. 



Figure 1 - Human augmented sensing spectrum. 


Generally speaking, this paper describes the concept of 
humans acquiring information remotely about specific 
targets using alternative (multi-spectral) sensing approaches. 
This paper contrasts and distinguishes the ability to 
penetrate specific targets with the intent to reconstruct in 
detail the morphology of the target into knowledge 
(computer-based processing) from the “end-effect” 
perception (human perception) of the physical world (the desired 
result). Finally, what mainly constitutes sensing is the end 
effect of a particular energy spectrum, and the source is 
arbitrary in all natural dimensions: temporal, spatial, 
energy spectrum, etc. 


Objectives 

This paper describes the needs and deficiencies of 
human-centered computing multimodal interfaces, and it 
addresses the super-position of human sensing (visual and 
audio). It is tied to the requirements of human-vehicle 
systems and human-environment interactions. 

A number of tools, such as heads-up displays, have been 
created to help address these deficiencies. However, the tools 
tend to specialize in only one of the limitations. Moreover, 
these tools do not capitalize, to any significant extent, on 
other human senses to augment the human visual system 
and to address the deficiencies described. Humans and 
primates rely heavily on vision to sense and react to the 
environment in order to achieve various goals. The aim here is to 
provide a capacity to sense beyond the human visible light 
range of the electromagnetic spectrum. 


Figure 1 is an illustration of human augmented sensing 
capability beyond the visible spectrum. “Human perceives 
and interprets the world through their sensing. Altering the 
human sensing of the world will alter his perception of it” 

The field of computer vision has already resulted in ample 
research ramifications in image interpretation, where visual 
perception aspects have gained a large influence. Numerous 
mathematical models for representation of the 
neurophysiological world, whether partial or complete, have 
been established into concepts and paradigms [8]. The 
immense complexity and evolution of the computational 
aspects of vision has now given way to an increased 
perception, comprehension and understanding of the 
surrounding world. Despite advances in computational 
vision, this paper revisits the nature of human 
perception. The focus will be on the evolution of ideas 
rather than on models, and, likewise, the emphasis is placed 
more often on the classical foundations of the field of human 
perception than on the current approaches in computer 
vision (e.g., camera registration and pattern recognition). 
Thus, the intention is to consider the human perception of 
the physical world achieved by correlating auditory sensing 
and visual sensing. 


The approach can also be applied to provide aural signal 
components representing shape signatures, sizes and 
estimated separation distances for objects that cannot be 
seen, or that are seen very imperfectly, because of signal 
interference, signal distortion and/or signal attenuation by 
the ambient environment. This may occur in a hazardous 
environment where fluids present provide an opaque, 
darkened or translucent view of objects in the environment, 
including moving or motionless persons and objects that 
present a hazard. 

This interference may also occur in an airborne environment 
in which rain, snow, hail, sleet, fog, condensation and/or 
other environmental attributes prevent reasonably accurate 
visual perception of middle distance and far distance 
objects. A visual image component that is likely to 
experience interference can be converted and presented as a 
sequence of audio signal attributes that can be more easily 
or more accurately perceived or interpreted by an operator 
of an aircraft (airborne or on the ground). The audio signal 
attributes may be extended to include estimated closing 
velocity between the operator/aircraft and the not-yet-seen 
object. 


3. Technical Approach 

Sensing Super-position as a way to augment the human 
vision system exploits the capabilities of the human hearing 
system. It is known that the human hearing system is 
capable of learning to process and interpret extremely 
complicated and rapidly changing auditory patterns, such as 
speech or music in a noisy environment. The available 
effective bandwidth, on the order of 15 kHz, corresponds to 
a channel capacity of several thousands of bits per second. 
The known capabilities of the human hearing system to 
learn and understand complicated auditory patterns provided 
the basic motivation for developing an image-to-sound 
mapping system. 



Figure 2 - Superposition as a head-mounted device. 


Figure 2 illustrates the device derived from an original 
sketch. This illustration demonstrates the concept of the 
superposition device showing multiple sensory 
super-impositions. Signals from external miniaturized multi-spectral 
sensors are translated into sounds targeting the same human 
visual field. 

The human brain is far superior to most existing computer 
systems in rapidly extracting relevant information from 
blurred, noisy, and redundant images. From a theoretical 
viewpoint, this means that the available bandwidth is not 
exploited in an optimal way. While image-processing 
techniques can manipulate, condense and focus the 
information (e.g., Fourier Transforms), keeping the mapping 
as direct and simple as possible might also reduce the risk of 
accidentally filtering out important clues. After all, a 
perfect non-redundant sound representation is especially 
prone to loss of relevant information in the imperfect 
human hearing system. Also, a complicated non-redundant 
image-to-sound mapping may well be far more difficult to 
learn and comprehend than a straightforward mapping, 
while the mapping system would increase in complexity and 
cost. This paper will demonstrate some basic information 
processing for optimal information capture for head- 
mounted display systems. 

Technically, it is difficult to obtain unambiguous high- 
resolution input using scanning sonar, while any 
commercially available low-cost camera will do. It should 
be added, however, that this statement only holds if the 
camera input is supplemented by depth clues through 
changes in perspective, to resolve ambiguities in distance. 
Further, relative depth information can be derived from 
evolving relative positions of the viewer and his 
environment, combined with knowledge about the typical 
real size of recognized objects. Augmenting the mapped-to- 
audio visual motion with mapped-to-audio binocular vision 
lends itself to providing image sensing beyond the concepts 
of binocular extension, which overall mimics the way the 
current visual system operates. Research in computer and 
machine vision is still active in motion and stereoscopy, 
operating concurrently or standalone as real-time systems. 

In order to increase the image resolution obtainable via an 
auditory representation, mapping is performed to distribute 
an image in time. A somewhat related method has been 
applied by the British in the design of tactile displays. In 
this paper, the three-dimensional spatial brightness and 
color maps of a visual image are processed using real-time 
image processing techniques (e.g., histogram normalization) 
and transformed into a two-dimensional map of an audio 
signal as a function of frequency and time. The same basic 
functionality of mapping was considered in [2]. 
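
As an illustration of the pre-processing step mentioned above, the following is a minimal sketch of histogram normalization, interpreted here as plain histogram equalization (an assumption; the paper does not specify the variant). It assumes a non-empty 8-bit grayscale image stored as a list of rows, and the function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Spread the gray levels of an image over the full range [0, levels - 1].

    `image` is a non-empty list of rows of integer gray levels (assumed layout).
    """
    # Count how many pixels fall on each gray level.
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1

    # Cumulative distribution of the gray levels.
    cdf = []
    total = 0
    for count in hist:
        total += count
        cdf.append(total)

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Flat image: there is nothing to spread.
        return [list(row) for row in image]

    # Look-up table that maps each input level to its equalized level.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[v] for v in row] for row in image]
```

The darkest level present maps to 0 and the brightest to `levels - 1`, so a low-contrast sensor image uses the whole amplitude range before it is converted to sound.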

The proposed translation breaks down into three logical 
operations in the general image-to-sound mapping of 
grayscale sensor-based images, such as those from thermal sensors. Each 
operation deals with one fundamental aspect or dimension 
of the world projection onto the sensors. The first operation is 
concerned with the first image dimension (horizontal in the visual 
field) and establishes the image scan vector. The second 
operation is concerned with the second image dimension 
(vertical in the visual field) and establishes the mapping to 
sound pitch. The third operation is concerned with the pixel 
intensity. 
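
The three operations above can be sketched as follows. This is a minimal illustration under stated assumptions, not the device's implementation: the frequency range, the sample rate, the choice of the top row as the highest pitch, and the use of amplitude (rather than reverberation) for pixel intensity are all assumptions, and the name `image_to_sound` is hypothetical.

```python
import math


def image_to_sound(image, duration=1.0, sample_rate=8000,
                   f_low=500.0, f_high=5000.0):
    """Render a grayscale image (rows of 0..255 values, row 0 on top) as mono samples."""
    n_rows = len(image)
    n_cols = len(image[0])

    # Operation 2: each row gets its own pitch (assumed: top row is highest).
    if n_rows > 1:
        freqs = [f_high - (f_high - f_low) * r / (n_rows - 1)
                 for r in range(n_rows)]
    else:
        freqs = [f_low]

    samples_per_col = int(duration * sample_rate / n_cols)
    samples = []
    # Operation 1: scan the columns in order, one time slice per column.
    for c in range(n_cols):
        for k in range(samples_per_col):
            t = (c * samples_per_col + k) / sample_rate
            # Operation 3: pixel intensity sets the amplitude of its row's tone.
            s = sum((image[r][c] / 255.0) * math.sin(2 * math.pi * freqs[r] * t)
                    for r in range(n_rows))
            samples.append(s / n_rows)
    return samples
```

A dark column produces silence in its time slice, while a bright pixel adds a tone at the pitch of its row, so the image is distributed in time along the scan direction and in frequency along the other axis.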

Stereoscopy 

Extracting distance information using a single camera is 
difficult because the distance information in a single camera 
view is essentially ambiguous and requires much a priori 
knowledge about the physical world, together with motion 
(multiple views), to derive distances from recognized objects. By 
comparing the slight differences in images obtained from 
two different simultaneous viewpoints, the distances to 
nearby objects can be understood. However, the level of 
visual resolution in stereoscopic vision is substantially 
increased with dynamic scenes (motion), where a strong 
correlation between the slight differences captured through 
stereo and motion is sensed. 

The stereoscopic processing used in the proposed 
technology is in contrast to the so-called anaglyphic approach often used 
in computer vision. Instead, two images are created from 
two distinct viewpoints from a single spectral signature. For 
thermal sensing, the two different views are not altered. 
The proposed stereoscopy process is pushed to the brain 
level in sound waves, but remains similar enough to biological 
stereoscopic vision. Stereoscopy is an overlay process 
achieved by the human brain. In a simulated environment, 
however, sighted viewers can see the actual 
scene in 3D by looking at the anaglyph through red-green 
glasses: the red filter in front of the left eye blocks the green 
or cyan component and transmits only the left-eye red image, 
while the green filter in front of the right eye transmits only 
the right-eye image. The human brain subsequently 
combines the slightly different views into one perceived 
three-dimensional view. Distance information is then 
apparent from something called “disparity”, the small visual 
displacements of the red and green/cyan color components 
for nearby objects. Whereas the left-eye and right-eye views 
may coincide perfectly for far-away objects, the mismatch 
for a nearby object tells how close this object is. This is the 
principle upon which the proposed 3D stereoscopic sensing 
super-position is based. 
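
The relation between disparity and distance described above can be made concrete with the standard pinhole model for two parallel cameras, Z = f B / d. This is a hedged sketch: the paper does not state this formula, and the function name, the pixel-based focal length, and the baseline parameter are assumptions introduced for illustration.

```python
def distance_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Estimate the distance (m) to a point from its left/right disparity (pixels).

    Assumes two parallel pinhole cameras separated by `baseline_m`.
    """
    if disparity_px <= 0:
        # The two views coincide: the object is effectively at infinity.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px
```

For example, with an assumed focal length of 500 pixels and a 0.1 m baseline, a 10-pixel disparity corresponds to roughly 5 m, and halving the distance doubles the disparity.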


4. Super-Sensing 

Since the human brain can combine slightly different views 
into one perceived three-dimensional view, distance 
information and resolution size (level of details) are then 
apparent from the stereoscopic disparity, i.e., the small 
visual displacements in two images. Whereas the left-eye 
and right-eye views may coincide indistinguishably for far- 
away objects, the mismatch for a nearby object tells how 
close this object is. However, the acuity of the human eye is 
limited by the density of the cones and rods in 
the human eye. Since the brain can analyze the difference 
in displacements, the device can use a denser pixel 
distribution in a CCD layout and map this additional stereo 
vision information to spatialized sound. 

Challenges with the technology development 

There is no suitable hardware commercially available for 
the visor prototype, but an electronics engineer should be 
able to construct stereo sensor systems ("3D thermal 
sensors") from standard commercially available 
components. In the first prototype, we intend to use two 
regular thermal CCD cameras, research adequate 
disparity angles for the camera resolution, and associate each 
camera with one eye and ear. We plan to represent 
individual intensity components directly as a function of 
reverberation or sound intensity. For example, it would 
render a bright red object with a strong reverberation and a 
zero-intensity region without any reverberation effects. An alternative 
to reverberation is to represent the pixel intensity 
by the amplitude of the signal. In that case, it 
would render a bright red object with a loud signal 
and a zero-intensity region without any sound. 

The coming decade of fast, cheap and miniaturized 
electronics and sensory devices opens new pathways for the 
development of sophisticated equipment to overcome 
limitations. To miniaturize this system, the work that lies ahead 
consists of dismantling and rebuilding common low-resolution 
CCD-based thermal cameras (no cooling needed) 
where the actual CCD array measures less than 1/4 inch in 
size. 

While the research aspect of the proposed technology will 
be based on a computer with anaglyphic simulations, the 
actual miniaturization will take a harder pathway in which 
electronic circuitry will be used to map the CCD sensor array 
signals to sounds. In all cases, use of head-mounted 
sensors is strongly recommended for the most consistent head 
sensory feedback, and to achieve sensor alignment with the 
visual axis. In the scope of this paper, the outcome of the 
simulation environment will set the course of the actual 
build. It is planned that two prototypes will be constructed. 

Geometric Variation and Impact on Human Perception 

In this paper, one planned research aspect is to attempt a 
variation on geometric masked images and measure human 
sensitivity to feature changes in the field of view. This 
exercise is very tempting since it answers a fundamental 
science question and compensates for a natural human 
sensing deficiency. There is an inherent deficiency in 
human perception that lies in the fact that the human visual field 
is horizontal with respect to the position of the human eyes. 
Sensing and localizing horizontal features (lines) in the visual 
field is difficult when the feature lies in the middle and not 
in the foveated region. 



Figure 3 - A 3D rendering and audio tracks. 

Figure 3 demonstrates a spherical field in motion in 3D and 
the corresponding sound tracks. The first image from the 
left reflects a 3D spherical field at a distance in space. The 
image is computer generated using the dual anaglyphic 
Cyan/Red filter for immersive 3D perception and simulates 
two stereoscopic cameras. The second image from the left 
reflects the sound tracks, namely the left and right channels. 
The overlap of the blue and red spheres implies minimal 
shifts in the left and right channels. The third and fourth 
images reflect the 3D sphere at closer distance and hence a 
wide disparity angle between the red and blue spheres with 
the corresponding sound tracks and shifts. (Note that the scan is 
horizontal, continuous, and symmetrical about the central axis. The 
pitch is vertical.) 
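
The link between visual disparity and the shift between the left and right sound tracks can be sketched as follows. This is only an illustration under an assumption the paper does not spell out: each camera image is scanned horizontally into its own ear channel, advancing one image column per fixed time slice. The function name and parameters are hypothetical.

```python
def channel_time_shift(disparity_px, scan_duration_s, n_cols):
    """Time offset (s) between the left and right channels for a feature that
    appears `disparity_px` columns apart in the two camera images."""
    seconds_per_column = scan_duration_s / n_cols
    return disparity_px * seconds_per_column
```

Under this assumption a distant sphere (near-zero disparity) yields nearly identical tracks, while a nearby sphere with an 8-column disparity in a 64-column, 1-second scan shifts one track by 0.125 s relative to the other.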


5. Super-position 

An example of supplementing, or replacing, visual signals 
by aural signals is presented here. 



Figure 4 - Characteristics of an audible signal. 

Figure 4 graphically illustrates seven signal parameters that 
can be used to collectively characterize an undulating, 
audibly perceptible signal having a single information-bearing 
(envelope) frequency and a single carrier frequency. 
This signal can be characterized by: an envelope frequency 
$f_e$; a carrier frequency $f_c$; an envelope frequency phase $\phi_e$ at 
a selected time, $t = t_\phi$; a carrier frequency phase $\phi_c$ at the 
selected time, $t = t_\phi$; a baseline function $b(t)$; a signal 
amplitude $a(t) = a_0 \sin\{f_e t + \phi_e\}$ relative to the baseline curve 
B; and a time interval (duration) $\Delta t$ for the signal. The 
human ear may be able to distinguish the phase difference, 
$\Delta\phi = \phi_e - \phi_c$, but cannot distinguish the absolute phases, $\phi_e$ 
and/or $\phi_c$. Thus, the maximum number of parameters for 
the signal shown in Figure 4 that may be distinguished by 
the human ear is six, if the (absolute) selected time, $t = t_\phi$, is 
not included. These six signal parameters may be used to 
audibly represent a corresponding visual component of an 
image, such as vertical location of the visual component 
(relative to a fixed two-dimensional or three-dimensional 
coordinate system), horizontal location of the visual 
component, component brightness, overall component 
brightness, and component predominant hue (color) or 
wavelength. Optionally, these audible signal parameters can 
be presented simultaneously or sequentially, for any 
corresponding visual image component that is so 
represented. In a sequential presentation, one or more 
additional audible signal parameters may be included, if the 
information corresponding to the additional parameter(s) is 
necessary for adequate representation of the image 
component. 



Figure 5 - Transforming an image to audio signals. 

Figure 5 schematically illustrates a mapping device used to 
transform selected visual image components associated with 
a region to an audible signal with audibly perceptible 
parameters. One or more visual image regions, represented 
as an assembly of visual signal components, are received and 
analyzed by a first signal receiver/processor (“R/P”). The 
first R/P analyzes a received visual signal component and 
provides one or more (preferably all) of the following visual 
signal characterization parameters: vertical location of the 
(center of the) region, horizontal location of the (center of 
the) region, region predominant hue (color) or wavelength, 
region average brightness and region peak brightness, using 
a region locator mechanism, a region predominant (or 
average) color sensing mechanism and a region brightness 
sensing mechanism. Output signals from the locator 
mechanism, from the color mechanism and from the 
brightness mechanism are received by a second R/P, which 
provides a collection of audible signal parameters, 
illustrated in Figure 4, that are determined, in whole or in 
part, by the output signals from the first R/P. 

As an example: the predominant or average color output 
signal from the color sensing mechanism can be used to 
determine the carrier frequency $f_c$; the brightness output 
signal from the brightness mechanism can be used to 
determine the envelope relative amplitude, $a_0$ or $a(t)$; the 
vertical and horizontal location output signals from the 
locator mechanism can be used to determine time duration 
$\Delta t$ (if the visual image component locations are indexed by a 
one-dimensional index), or to determine time duration $\Delta t$ 
and envelope frequency $f_e$ (if the visual image region 
locations are indexed using a two-dimensional index). 
Generally, the four visual signal parameters can be assigned 
to four of the six audibly perceptible signal parameters 
(Figure 4) in $\binom{6}{4} = (6 \cdot 5 \cdot 4 \cdot 3)/(4 \cdot 3 \cdot 2 \cdot 1) = 15$ distinguishable 
ways. More generally, $N$ visual signal parameters can be 
assigned to $M$ ($> N$) audibly perceptible signal parameters in 
$\binom{M}{N}$ distinguishable ways. 
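
The example assignment and the count of alternatives can be sketched as follows. This is a minimal illustration under stated assumptions: the numeric ranges, the choice of vertical location for the envelope frequency and horizontal location for the duration, and all names (`AUDIO_PARAMS`, `count_parameter_subsets`, `map_region_to_audio`) are hypothetical and not taken from the described device.

```python
from itertools import combinations

# The six audibly distinguishable parameters named in the text.
AUDIO_PARAMS = ["f_e", "f_c", "delta_phi", "b", "a", "dt"]


def count_parameter_subsets(n_visual, audio_params=AUDIO_PARAMS):
    """Count the ways to choose which audio parameters carry n_visual visual ones."""
    return len(list(combinations(audio_params, n_visual)))


def map_region_to_audio(hue_deg, brightness, row, col, n_rows, n_cols):
    """Map one visual region to audio parameters using illustrative ranges."""
    # Color -> carrier frequency f_c (assumed range 1000..5000 Hz).
    f_c = 1000.0 + (hue_deg % 360.0) / 360.0 * 4000.0
    # Brightness (0..1, clipped) -> relative amplitude a_0.
    a_0 = max(0.0, min(1.0, brightness))
    # Vertical location -> envelope frequency f_e (assumed 250..500 Hz,
    # top row highest).
    f_e = 250.0 + (n_rows - 1 - row) / max(1, n_rows - 1) * 250.0
    # Horizontal location -> duration dt (assumed 0.05..0.25 s).
    dt = 0.05 + col / max(1, n_cols - 1) * 0.20
    return {"f_c": f_c, "a_0": a_0, "f_e": f_e, "dt": dt}
```

Calling `count_parameter_subsets(4)` enumerates the 15 subsets counted in the text; `map_region_to_audio` shows just one of those assignments, using the two-dimensional location index.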

The analysis performed by each of the mechanisms is not 
instantaneous, and the associated time delays may not be the 
same for each analyzer. For this reason, an overall time 
delay 

$$\Delta t(o1) > \min\{\Delta t(\text{location}), \Delta t(\text{color}), \Delta t(\text{brightness})\} \tag{1}$$

is preferably imposed, using a time delay mechanism, 
before an aural signal (or aural signal sequence) 
incorporating the one, two, three, four, five or six aural 
signal parameters is aurally displayed. If the aural signal 
parameters are displayed sequentially, rather than 
simultaneously or collectively, this time delay might be 
reduced or eliminated. The overall time delay is 
implemented by a time delay mechanism, which 
incorporates an appropriate time delay value for each of the 
aural signal parameters received from the R/P. A signal 
formation mechanism (optional) forms and issues either: (1) 
an audibly perceptible, ordered sequence of the set of 1-6 
aural signal components ASC(k), k = 1, ..., 6 (or a subset), 
or (2) a collective audibly perceptible signal APS 
incorporating the set (or a subset) of the aural signal 
components. The output signal from the signal formation 
mechanism is perceived by the human. 



Figure 6 - Transforming an image to audio signals. 

The R/P, illustrated in Figure 6, includes one or more of the 
following: a carrier/envelope frequency ($f_c$) analyzer; an 
envelope-carrier frequency phase difference ($\Delta\phi$) analyzer 
and baseline function ($b(t)$) analyzer that estimate the phase 
difference at a selected time and determine the baseline 
function; and a relative signal amplitude ($a_0$ or $a(t)$) 
analyzer, relative to the baseline function at a corresponding 
time. 

The analysis performed by each of these analyzers is not 
instantaneous, and the associated time delays may not be the 
same for each analyzer. For this reason, an overall time 
delay 

$$\Delta t(o2) > \min\{\Delta t(f_e), \Delta t(f_c), \Delta t(\Delta\phi), \Delta t(b), \Delta t(a)\} \tag{2}$$

is preferably imposed before an aural signal incorporating 
the one, two, three, four or five converted visual signal 
parameters is aurally displayed. If the converted visual 
signal parameters are aurally displayed sequentially, rather 
than simultaneously, this time delay might be reduced or 
eliminated. The overall time delay is implemented by a time 
delay mechanism, which incorporates an appropriate time 
delay for each of the aural parameters received from the 
R/P. A signal formation module forms a single composite 
aural signal representing an aural image component, and 
issues this component as an output signal. 

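The signal shown in Figure 5 may be represented in a form 

$$\begin{aligned}
F_{vic}(t) &= b(t) + a_0 \sin\{f_e (t - t_\phi) + \Delta\phi\} \sin\{f_c (t - t_\phi)\} \\
&= b(t) + \frac{a_0}{2} \cos\{(f_c - f_e)(t - t_\phi) - \Delta\phi\} - \frac{a_0}{2} \cos\{(f_c + f_e)(t - t_\phi) + \Delta\phi\}
\end{aligned} \tag{3}$$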

The carrier/envelope frequency analyzer forms a sequence 
of correlation signals, computed over a time interval of 
length $T$, 

$$C_1 = \frac{1}{T} \int F_{vic}(t) \sin\{f_{cs} t\}\, dt, \tag{4A}$$

$$C_2 = \frac{1}{T} \int F_{vic}(t) \cos\{f_{cs} t\}\, dt, \tag{4B}$$

at each of a spaced apart sequence of “translated” carrier 
frequencies, $f_{cs}$, in a selected carrier frequency range, $f_{c1} < 
f_{cs} < f_{c2}$, where $f_c$ is not yet known, and provides an estimate 
of two spaced apart frequencies, $f_{cs1} = f_c + f_e$ and $f_{cs2} = f_c - f_e$, 
associated with the visual image component (VIC), where the correlation 
combination, $C_1^2 + C_2^2$, has the highest magnitudes. The 
envelope and carrier frequencies are then determined from 

$$f_c = (f_{cs1} + f_{cs2})/2, \tag{5A}$$

$$f_e = (f_{cs1} - f_{cs2})/2. \tag{5B}$$
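
The correlation scan described above can be sketched numerically. This is a hedged illustration, not the analyzer itself: the integrals are replaced by sums over samples, frequencies are expressed in Hz with an explicit factor of 2π, the two strongest local maxima of the correlation power are taken as the sum and difference frequencies, and all function names are hypothetical.

```python
import math


def correlation_power(signal, sample_rate, freq_hz):
    """Return C1^2 + C2^2 for the signal correlated with sin and cos at freq_hz."""
    n = len(signal)
    c1 = sum(x * math.sin(2 * math.pi * freq_hz * k / sample_rate)
             for k, x in enumerate(signal)) / n
    c2 = sum(x * math.cos(2 * math.pi * freq_hz * k / sample_rate)
             for k, x in enumerate(signal)) / n
    return c1 * c1 + c2 * c2


def estimate_carrier_and_envelope(signal, sample_rate, f_lo, f_hi, step):
    """Scan trial frequencies and return (f_c, f_e) from the two strongest peaks."""
    trial = []
    f = f_lo
    while f <= f_hi:
        trial.append((f, correlation_power(signal, sample_rate, f)))
        f += step

    # Keep local maxima of the correlation power, strongest first.
    peaks = [trial[i] for i in range(1, len(trial) - 1)
             if trial[i][1] > trial[i - 1][1] and trial[i][1] >= trial[i + 1][1]]
    peaks.sort(key=lambda p: p[1], reverse=True)

    # The two strongest peaks are assumed to be f_c - f_e and f_c + f_e.
    f_diff, f_sum = sorted(p[0] for p in peaks[:2])
    return (f_sum + f_diff) / 2, (f_sum - f_diff) / 2
```

For a synthetic signal with a 1000 Hz carrier and a 300 Hz envelope, the scan finds peaks at 700 Hz and 1300 Hz, whose half-sum and half-difference recover the carrier and envelope frequencies.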

The envelope-carrier phase difference $\Delta\phi$ and relative 
amplitude $a_0$ are determined by computing the correlations 

$$\frac{1}{T} \int F_{vic}(t) \sin\{f_e (t - t_\phi)\}\, dt = a_0 \cos\Delta\phi, \tag{6A}$$

$$\frac{1}{T} \int F_{vic}(t) \cos\{f_e (t - t_\phi)\}\, dt = a_0 \sin\Delta\phi, \tag{6B}$$

from which the quantities $a_0$ ($> 0$) and $\Delta\phi$ are easily 
determined. The baseline function $b(t)$ is then determined 
from 

$$b(t) = F_{vic}(t) - \frac{a_0}{2} \cos\{(f_c - f_e)(t - t_\phi) - \Delta\phi\} + \frac{a_0}{2} \cos\{(f_c + f_e)(t - t_\phi) + \Delta\phi\}. \tag{7}$$
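
The step of recovering the amplitude and the phase difference from the two correlations (6A) and (6B) can be written out explicitly. This is a small sketch that takes the two correlation values as given; the function name is hypothetical.

```python
import math


def amplitude_and_phase(corr_sin, corr_cos):
    """Invert corr_sin = a0*cos(dphi) and corr_cos = a0*sin(dphi).

    Returns (a0, dphi) with a0 >= 0 and dphi in (-pi, pi].
    """
    a0 = math.hypot(corr_sin, corr_cos)
    dphi = math.atan2(corr_cos, corr_sin)
    return a0, dphi
```

Using `atan2` rather than a plain arctangent of the ratio keeps the phase difference in the correct quadrant when the sine correlation is negative or zero.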

The frequency difference ($f_c - f_e$) and frequency sum ($f_c + f_e$) 
values are distinguished from each other in a normally 
functioning human auditory system if the difference between 
the sum and difference frequencies, $2 f_e$, is at least equal to a 
threshold value, such as 500 Hz. 

The approach can also be applied to “enrich” image detail or 
manifest more clearly some image details that are not 
evident where the region is viewed solely with reference to 
visible light wavelengths. For example, some details of a 
region may be hidden or muddled when viewed in visible 
wavelength light but may become clear when the region is 
illuminated with, or viewed by an instrument that is 
sensitive to, near-infrared light (wavelength λ ~ 0.7 - 2 μm) 
or mid-infrared light (λ ~ 1 - 20 μm) or ultraviolet light (λ 
< 0.4 μm). These hidden details can be converted to aural 
signal parameter values that are more easily audibly 
perceived as part of a received aural signal. Operated in this 
manner, the approach can separately compensate for a 
relatively narrow (or relatively broad) visible wavelength 
sensitivity of the viewer by mapping it into a relatively narrow (or 
relatively broad) auditory frequency sensitivity of the same 
viewer. Operated in this manner, the visible wavelength 
sensitivity of a first (visual image) viewer of the image 
component can be adjusted and compensated for 
electronically by adjusting the aural wavelength range of 
one or more of the aural signal parameters, before the 
transformed aural signal is received by the same viewer or 
by a different viewer. 


6. Related Work 

John Zelek et al. of the School of Engineering at the 
University of Guelph, Canada, have developed a stereo- 
vision system for the blind using a tactile display for 
showing the nearest obstacles [18][19][20]. This work 
advances research in image processing and 
segmentation techniques to the level of target recognition. 

Another contributor in the domain of mapping images to sounds is 
Yoshihiro Kawai, as described in [21][22]. Kawai’s work 
focuses on 3D spatial localization of objects prior to mapping to 
stereo sounds. 

However, the work most relevant to what is proposed in 
this research is the neurophysiological mapping work 
found in the literature, such as the Sound Graph 
method described in [10]. For example, Douglass et al. 
demonstrate that mathematical concepts such as symmetry, 
monotonicity, and gradients can be determined using 
sound. Additional work on sound mapping is found 
in [6]. A clear distinction of the proposed work is the 
mapping of the invisible spectrum to human perception, 
bringing forth the human multi-sensing nature to a level not 
achieved before. 

Competing Technology & Current Status 

This technology competes with the approach of mapping 
recognized objects to English using natural language 
(speech). Sensing super-position warrants a one-to-one 
mapping of natural stereoscopy and time-varying scenes to 
audio signals. 


To maintain focus on the potential and interests, the 
developmental prototypes include thermal sensing. For 
now, the thermal sensing super-imposition capability 
demonstrates a new way of perception in human sensing. 